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1 Abstract 

2 Recently, a number of protocols extending RNA-sequencing to the 

3 single-cell regime have been published. However, we were concerned that 

4 the additional steps to deal with such minute quantities of input sam- 

5 pie would introduce serious biases that would make analysis of the data 

6 using existing approaches invalid. In this study, we performed a critical 

7 evaluation of several of these low-volume RNA-seq protocols, and found 

8 that they performed slightly less well in metrics of interest to us than a 

9 more standard protocol, but with at least two orders of magnitude less 

10 sample required. We also explored a simple modification to one of these 

11 protocols that, for many samples, reduced the cost of library preparation 

12 to approximately $20/sample. 



is 1 Introduction 

14 Second-generation sequencing of RNA (RNA-seq) has proven to be a sensitive 

is and increasingly inexpensive approach for a number of different experiments, 

16 including annotating genes in genomes, quantifying gene expression levels in a 

17 broad range of sample types, and determining differential expression between 
is samples. As technology improves, transcriptome profiling has been able to be 

19 applied to smaller and smaller samples, allowing for more powerful assays to 

20 determine transcriptional output. For instance, our lab has used RNA-seq on 

21 single Drosophila embryos to measure zygotic gene activation [18] and medium- 

22 resolution spatial patterning [4]. Further improvements will allow an even 

23 broader array of potential experiments on samples that were previously too 

24 small. 

25 For instance, over the past few years, a number of groups have published de- 

26 scriptions of protocols to perform RNA-seq on single cells (typically mammalian 
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27 cells) [25, 23, 24, 11, 14]. A number of studies, both from the original authors 

28 of the single-cell RNA-seq protocols and from others, have assessed various as- 

29 pects of these protocols, both individually and competitively [2, 27, 19]. One 

30 particularly powerful use of these approaches is to sequence individual cells in 

31 bulk tissues, revealing different states and cellular identies [3, 26]. 

32 However, we felt that published descriptions of single-cell and other low- 

33 volume protocols did not adequately address whether a change in concentration 

34 of a given RNA between two samples would result in a proportional change 

35 in the FPKM (or any other measure of transcriptional activity) between those 

36 samples. While there are biases inherent to any protocol, we were concerned 

37 that direct amplification of the mRNA would select for PCR compatible genes 

38 in difficult to predict, and potentially non-linear ways. For many of the pub- 

39 lished applications of single cell RNA-seq, this is not likely a critical flaw, since 

40 the clustering approaches used are moderately robust to quantitative changes. 

41 However, to measure spatial and temporal activation of genes across an embryo, 

42 it is important that the output is monotonic with respect to concentration, and 

43 ideally linear. 

44 While it is possible to estimate absolute numbers of cellular RNAs from an 

45 RNAseq experiment, doing so requires spike-ins of known concentration and 

46 estimates of total cellular RNA content [21, 17]. However, many RNA-seq ex- 

47 periments do not do these controls, nor are such controls strictly necessary un- 

48 der reasonable, though often untested, assumptions of approximately constant 

49 RNA content. While ultimately absolute concentrations will be necessary to 
so fully predict properties such as noise tolerance of the regulatory circuits [9, 8], 

51 many current modeling efforts rely only on scaled concentration measurements, 

52 often derived from in siiw-hybridization experiments [7, 13, 12]. Given that, we 

53 felt it was not important that different protocols should necessarily agree on 

54 any particular expression value for a given gene, nor are we fully convinced that 

55 absolute expression of any particular gene can truly reliably be predicted in a 

56 particular experiment. 

57 In order to convince ourselves that data generated from limiting samples 

58 would be suitable for our purposes, we evaluated several protocols for perform- 

59 ing RNA-seq on extremely small samples. We also investigated a simple modifi- 
eo cation to one of the protocols that reduced sample preparation cost per library 
ei by more than 2-fold. Finally, we evaluated the effect of read depth on quality of 

62 the data. This study provides a single, consistent comparison of these diverse 

63 approaches, and shows that in fact all data from the low-volume protocols we 

64 examined are usable in similar contexts to the earlier bulk approach. 

65 2 Results 

ee 2.1 Experiment 1: Evaluation of Illumina TruSeq 

e? In our hands, the Illumina TruSeq protocol has performed extremely reliably 

es with samples on the scale of lOOng of total RNA, the manufacturer recom- 
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mended lower limit of the protocol. However, attempts to create libraries from 
much smaller samples yielded low complexity libraries, corresponding to as much 
as 30-fold PCR duplication of fragments. Anecdotally, less than 5% of libraries 
made with at least 90ng of total RNA yielded abnormally low concentrations, 
which we observed correlated with low complexity (Data not shown). To deter- 
mine the lower limit of input needed to reliably produce libraries, we attempted 
to make libraries from 40, 50, 60, 70, and 80 ng of Drosophila total RNA, each 
in triplicate. 

Table 1: Total TruSeq cDNA library yields made with a given amount of input 
total RNA. Yields measured by Nanodrop of cDNA libraries resuspended in 
25/iL of EB. The italicized samples were unusually low, and when analyzed 
with a Bioanalyzer, showed abnormal size distribution of cDNA fragments. 



Amount Input RNA 


Replicate A 


Replicate B 


Replicate C 


40 ng 


57 ng 


425 ng 


672 ng 


50 ng 


435 ng 


768 ng 


755 ng 


60 ng 


115 ng 


663 ng 


668 ng 


70 ng 


300 ng 


593 ng 


653 ng 


80 ng 


468 ng 


550 ng 


840 ng 



We considered the two libraries with lower than usual concentration to be 
failures. While a failure rate of approximately 1 in 3 might be acceptable for 
some purposes, we ultimately wanted to perform RNA sequencing on precious 
samples, where a failure in any one of a dozen or more libraries would neces- 
sitate regenerating all of the libraries. Furthermore, due to the low sample 
volumes involved (less than approximately 500pg of poly-adenylated mRNA), 
common laboratory equipment is not able to determine the particular point in 
the protocol where the failures occurred. 

Thus, we consider 70 ng of total RNA to be the conservative lower limit to 
the protocol. While this is about 30% smaller than the manufacturer suggests, it 
is still several orders of magnitude larger than we needed it to be. We therefore 
considered using other small-volume and "single-cell" RNA-seq kits, which we 
had less experience with and less faith in the data. 

2.2 Experiment 2: Competitive Comparison of Low- volume 
RNAseq protocols 

We first sought to determine whether the low-volume RNAseq protocols avail- 
able faithfully recapitulate linear changes in abundance of known inputs. We 
generated synthetic spike-ins by combining D. melanogaster and D. virilis total 
RNA in known, predefined proportions of 0, 5, 10, and 20% D. virilis RNA. For 
each of the low-volume protocols, we used lng of total RNA as input, whereas 
for the TruSeq protocol we used lOOng. 
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98 Although pre-defined mixes of spike-in controls have been developed and are 

99 commercially available [15], we felt it was important to ensure that a given pro- 

100 tocol would function reproducibly with natural RNA, which almost certainly has 

101 a different distribution of 6-mers, which could conceivably affect random cDNA 

102 priming and other amplification effects. Furthermore, our spike-in sample more 

103 densely covers the approximately 10 5 fold coverage typical of RNA abundances. 

104 It should be noted, however, that our sample is not directly comparable to any 

105 other standards, nor is the material of known strandedness. We assumed that 

106 the majority of each sample is from the standard annotated transcripts, but did 

107 not verify this prior to library construction and sequencing. 

los The different protocols had a variation in yield of libraries from between 

109 6 fmole (approximately 3.6 trillion molecules) and 2,400 fcmtomoles, with the 

no TruSeq a clear outlier at the high end of the range, and the other protocols all 

in below 200 fmole (Table 2.2). All of these quantities are sufficient to generate 

u2 hundreds of millions of reads — far more than is typically required for an RNA- 

n3 seq experiment. We pooled the samples, attempting equimolar fractions in the 

n4 final pool; however, due to a pooling error, we generated significantly more reads 

us than intended for the TruSeq protocol, and correspondingly fewer in the other 

lie protocols. Unless otherwise noted, we therefore sub-sampled the mapped reads 

n7 to the lowest number of mapped reads in any sample in order to provide a fair 

us comparison between protocols. 

n9 We were interested in the fold-change of each D. virilis gene across the four 

120 samples, rather than the absolute abundance of any particular gene. Therefore, 

121 after mapping and gene quantification, we normalized the abundance Aij of 

122 every gene i across the j = 4 samples by a weighted average of the quantity Qj 

123 of D. virilis in sample j, as show in equation 1. Thus, within a given gene, a 

124 linear fit of A\j vs Qj should have a slope of one and an intercept of zero. 

126 We filtered the D. virilis genes for those with at least 20 mapped fragments 

127 in the sample with 20% D. virilis, then calculated an independent linear re- 

128 gression for each of those genes. As expected, for every protocol, the mean 

129 slope was 1 (t-test, p < 5 x 10^ 7 for all protocols). Similarly, the average in- 

130 tercepts for all protocols was 0 (t-test, p < 5 x 10~ 7 for all protocols). Also 

131 unsurprisingly, the TruSeq protocol had a noticeably higher mean correlation 

132 coefficient (0.98 ± 0.02) than any of the other protocols (0.95 ± 0.06, 0.92 ± 0.09, 

133 and 0.95 ±0.06 for Clontech, TotalScript, and SMART-seq2, respectively). The 

134 mean correlation coefficient was statistically and practically indistinguishable 

135 between the Clontech samples and the SMART-seq2 samples (t-test p = .11, 

136 Figure 2.2). 

137 Indeed, the only major differentiator we could find between the low-volume 

138 protocols we measured was cost. For only a handful of libraries, the kit-based 

139 all inclusive model of the Clontech and TotalScript kits could be a significant 

140 benefit, allowing the purchase of only as much of the reagents as required. By 
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141 contrast, the Smart-seq2 protocol requires the a la carte purchase of a number 

142 of reagents, some of which are not available or more expensive per unit for 

143 smaller quantities. Furthermore, there could potentially be a "hot dogs and 

144 buns" problem, where reagents are sold in non-integer multiples of each other, 

145 leading to leftovers. Many of these reagents are not single-purpose, however, so 
we leftovers could in principle be repurposed in other experiments. 

Table 2: Summary of protocols used in experiments 2 and 3. Cost is estimated 
per sample assuming a large number of libraries at US catalog prices as of May 
2014, and includes RNA extraction. 



Protocol 


Shorthand 


Cost/library 


TruSeq 


TruS 


$45 


Clontech 


CT 


$105 


TotalScript 


TotS 


$115 


Smart-seq2, standard protocol 


SS 


$55 


Smart-seq2, 2.5 fold dilution 


SS— 2.5x 


$28 


Smart-seq2, 5 fold dilution 


SS— 5x 


$20 



147 2.3 Experiment 3: Further modifications to the SMART- 

148 seq2 protocol 

149 Although the SMART-seq2 was the cheapest of the protocols, we wondered 

150 whether it could be performed even more cheaply without compromising data 

151 quality. This would enable us to include more biological replicates in the future 

152 experiments for which we are evaluating these protocols. In the original protocol, 

153 we noticed that roughly 60% of the cost came from the Nextera XT reagents. 

154 Thus, reducing the cost of tagmentation was the obvious goal to target. 

155 We made additional libraries, again starting with lng of total RNA. We 

156 amplified a single set of spike-in samples with 0, 5, 10, and 20% D. virilis 

157 total RNA as in experiment 2, and made a single an additional sample with 

158 1% D. virilis RNA. Starting at the point in the SMART-seq2 protocol where 

159 tagmentation was started, we performed reactions in volumes 2.5 x and 5x 

160 smaller, using proportionally less cDNA as well. Due to the low total yield, we 
lei increased the number of enrichment cycles from 6 to 8 (see methods). 

162 When normalized to the same number of reads as in experiment 2, the 

163 protocols with diluted Nextera reagents performed effectively identically: for 

164 instance, the mean correlation coefficients were in both cases 0.96 ± 0.05 (Fig. 

165 2 and Table 4). This is despite the additional cycles of enrichment, which 
lee improved yield. 

167 Because we used a common set of pre-amplified cDNA samples that was 

168 performed in a distinct pre-amplification from experiment 2, we can estimate 

169 the contribution of that pre-amplification to the overall variation. If, in fact, the 

170 pre-amplification is a major contributor to the variation, then we would expect 

171 to find that the correlation between, for instance, the slopes of two runs of the 
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Experiment 


Protocol 


% D. virilis 


Yield (fmole) 


Reads 


Mapped 


2 


CT 


0 


6.5 


3,803,843 


3,374,520 


2 


55 


5 


15.7 


4,372,738 


4,164,781 


2 


11 


10 


47.4 


10,013,087 


9 ,527,023 


2 


11 


20 


17.8 


4,781,463 


4,317,101 


2 


TotS 


0 


176.8 


3,281,134 


2,930,058 


2 


55 


5 


170.2 


2,498,134 


2,237,330 


2 


11 


10 


102.5 


5,777,523 


5,424,366 


2 


11 


20 


119.9 


6,068,996 


5,740,496 


2 


TruS 


0 


2,401.0 


67,560,511 


64,024,881 


2 




5 


2,001.1 


23,370,854 


22,589,083 


2 




10 


2,174.2 


39,454,390 


38,093,763 


2 




20 


2,379.2 


35,265,536 


34,304,792 


2 


SS2 


0 


34.3 


2,439,518 


2,297,087 


2 




5 


59.6 


2,550,023 


2,419,889 


2 


5! 


10 


67.9 


2,534,628 


2,444,568 


2 




20 


39.8 


2,504,340 


2,389,850 


3 


SS2 — 2.5x 


0 


104.4 


15,769,915 


14,393,959 


3 




1 


124.7 


21,349,748 


20,084,131 


3 


55 


5 


113.0 


17,047,120 


16,329,641 


3 


55 


10 


103.5 


23,762,232 


22,372,562 


3 


55 


20 


123.8 


20,809,781 


20,041,548 


3 


SS2— 5x 


0 


59.4 


19,214,155 


17,324,598 


3 


55 


1 


58.6 


23,832,274 


22,364,220 


3 


55 


5 


65.4 


18,149,452 


17,157,450 


3 


55 


10 


28.8 


15,821,419 


14,869,864 


3 


55 


20 


57.2 


22,466,345 


21,620,603 



Table 3: Sequencing summary statistics for samples. Protocols are the short- 
hands used in table 2. Reads indicates the total number of reads, and Mapped 
the total number of reads that mapped at least once to either genome. Experi- 
ments 2 and 3 were run in a single HiSeq lane each. 
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Table 4: Distribution of fit parameters. A simple linear fit, — m - Qj + b was 
computed for each gene i, and a correlation coefficent r calculated. For brevity, 
x is the mean of some variable x, and a x is its standard deviation. 



Protocol 


m±a m 


b±<7 b 


r±oy 


TruSeq 
Clontech 
TotalScript 
Smart-seq2 
Smart-seq2, 2.5 fold dilution 
Smart-scq2, 5 fold dilution 


1.01±0.0698 

1.01±0.12 
0.952±0.129 

1.03±0.121 
0.996±0.111 

l.OliO.lll 


-0.108±1.05 
-0.217±1.79 
0.715±1.93 
-0.506±1.82 
0.0623±1.67 
-0.173±1.66 


0.98±0.019 
0.95±0.061 
0.93±0.094 
0.95±0.057 
0.96±0.053 
0.96±0.049 



172 same experiment with different pre-amplifications would be significantly lower 

173 than the correlation between the slopes of two runs using the same pre-amplihcd 

174 cDNA pools. 

175 Unsurprisingly, the sets of samples that used the same preamplification were 

176 more correlated with each other than with the set of samples that used a separate 

177 pre-amplification (Fig. 3). By analogy to dual-reporter expression studies[6], we 

178 term variation along the diagonal "extrinsic noise" (rj ext = std(m! + m 2 )), and 

179 variation perpendicular to the diagonal "intrinsic noise" (rji nt — std(mi — TO2)), 

180 being intrinsic to the pre-amplification step. Using that metric, the intrinsic 

181 noise is lower for the samples with the same pre-amplification (rji nt = 0.09) 

182 than for the samples with different pre-amplifications (r]i n t = 0.16). Somewhat 

183 surprisingly, the extrinsic noise is higher for the samples with the same pre- 

184 amplification (j] ext = 0.20 vs rj ext — 0.16), perhaps due to the 2 additional 

185 cycles of PCR enrichment. 

186 3 Discussion 

is? When sample size is not the limiting factor, it is clear that using well-established 

las protocols that involve minimal sequence-specific manipulation of the sample 

189 yields the best results, both in terms of reproducibility and linearity of response. 

190 However, if it is not practical to collect such relatively large samples, we believe 

191 that any of the "single-cell" protocols we have tested should perform similarly, 

192 and can be used as a drop-in replacement. While preamplification steps do 

193 introduce some detectable variance, it is not vastly detrimental to the data 

194 quality, and does not introduce obvious sequence-specific biases. 

195 Such methods should be strongly preferred if it is feasible to collect a suit- 

196 ably homogenous sample. While bulk tissues may be a mixture of multiple 

197 distinct cell types, this may or may not affect the particular research question 

198 an RNAseq experiment is designed to answer. In our hands, the lower limit 

199 of reliable library construction using the Illumina TruSeq kit is approximately 

200 70ng of total RNA; with non precious samples, the practical limit is likely to 



7 



Downloaded from http://biorxiv.org/on September 18, 2014 



201 be even lower. Although we believe there is significant user-to-user variation, it 

202 seems unreasonable to expect order-of-magnitudc improvements are possible in 

203 techniques for precious samples. We suggest that this limit may be related to 

204 cDNA binding to tubes or purification beads, but since the quantities are lower 

205 than the detection threshold of many standard quality control approaches, we 

206 cannot directly verify this, nor do we believe that knowing the precise cause is 

207 likely to suggest remediation techniques. 

208 Compared to the regimes these protocols were designed for, we used a rel- 

209 atively large amount of input RNA — 1 ng of total RNA — corresponding to ap- 

210 proximately 50 nuclei of a mid-blastula transition Drosophila embryo. Previous 

211 studies have shown that this amount of RNA is well above the level where 

212 stochastic variation in the number of mRNAs per cell will strongly affect the 

213 measured expression of a vast majority of genes [19]. It is nevertheless a small 

214 enough quantity to be experimentally relevant. For instance, we have previously 

215 dissected single embryos into approximately 12 sections, yielding approximately 

216 lOng per section[4], and one could conceivably perform similar experiments on 

217 imaginal discs or antennal structures, which contain a similar amount of cells 

218 [16, 10]. 

219 One of the more striking results is that costs can be significantly reduced by 

220 simply performing smaller reactions, without noticeably degrading data quality. 

221 We do not suspect this will be true for arbitrarily small samples, such as from 

222 single cells. Instead, it is likely only true for samples near the high end of the 

223 effective range of the protocol. We have not explored where this result breaks 

224 down, and strongly caution others to verify this independently using small pilot 

225 experiments before scaling up. 

4 Methods 

227 4.1 RNA Extraction, Library Preparation, and Sequenc- 

228 ing 

229 We performed RNA extraction in TRIzol (Life Technologies, Grand Island, NY) 

230 according to manufacturer instructions, except with a higher concentration of 

231 glycogen as carrier (20 ng) and a higher relative volume of TRIzol to the ex- 

232 pected material (1 mL, as in [18] and [4]). We quantified RNA concentrations 

233 using a fluorometric Qubit RNA HS assay (Life Technologies). 

234 TruSeq libraries were prepared with the "TruSeq RNA Sample Preparation 

235 Kit v2" (Illumina Cat.#RS-122-2001) according to manufacturer instructions, 

236 except for the following modifications. All reactions were performed in half 

237 the volume of reagents. We find that this increases the effective concentration 

238 of RNA and cDNA. We performed all reactions and cleanups in 8-tube PCR 

239 strip tubes, which allowed us to reduce the volume of Resuspension Buffer to 

240 minimize volume left behind after each cleanup. 

241 Clontech libraries were prepared with the "Low Input Library Prep Kit" 

242 (Clontech Cat. #634947). We generated cDNA by using TruSeq reagents until 
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243 the cDNA synthesis step. Then, we used the Low Input Library Prep Kit to 

244 modify the cDNA into sequencing-competent libraries. We believe that a similar 

245 cDNA synthesis could be performed using oligo dT Dynabeads, RNA fragmen- 

246 tation reagents, and Superscript II (Life Technologies), for an approximate cost 

247 per sample of $15. 

248 TotalScript libraries were prepared with the "TotalScript RNA-Seq Kit" and 

249 "TotalScript Index Kit" (Epicentre Cat.#TSRNAl296 and TSIDX12910). We 

250 followed the manufacturer's instructions, and used the oligo dT priming option. 

251 We performed the mixed priming option in parallel, which yielded approximately 

252 4-fold more library, but did not sequence them due to concerns of ribosomal 

253 contamination. 

254 SMARTseq2 libraries were prepared according to the protocol in Picelli et 

255 a/. (2014) [22]. Because we had already extracted and mixed the RNA, we began 

256 at step 5 with 3.7 [iL of dNTPs and 1 [iL of 37 \xM oligo dT primer, yielding the 

257 same concentration of primer and oligo as originally reported. We used 18 cycles 

258 for the preamplification PCR in step 14, added lng of cDNA to the Nextera XT 

259 reactions in step 28, and used 6 and 8 cycles for the final enrichment in step 33 

260 (experiments 2 and 3, respectively). 

261 Libraries were quantified using a combination of Qubit High Sensitivity 

262 DNA (Life Technologies) and Bioanalyzer (Agilent Technologies, Sunnyvale, 

263 CA) readings, then pooled to equalize index concentration. Due to a pooling 

264 error in experiment 2, the TruSeq libraries were included at much higher abun- 

265 dance. Pooled libraries were then submitted to the Vincent Coates Genome 

266 Sequencing Laboratory for 50bp single-end sequencing according to standard 

267 protocols for the Illumina HiSeq 2500. Bases were called using HiSeq Control 

268 Software vl.8 and Real Time Analysis v2.8. 

269 4.2 Mapping and Quantification 

270 Reads were mapped using STAR [5] to a combination of the FlyBase reference 

271 genome version 5.54 for D. melanogaster and D. virilis [20]. We randomly sam- 

272 pled the mapped reads to use an equal number in each sample compared. We 

273 used HTScq (command line options "htseq-count — idattr='gene\_name' — stranded=no — sorted=pos 

274 to count absolute read abundance per gene [1]. 
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Figure 1: Comparison of linearity between different RNA-seq pro- 
tocols. A) Normalized levels of gene expression A across samples using the 
TruSeq protocol, where each line is for a different gene. B-E) Distributions of 
slopes, intercepts, and correlation coefficient for linear regressions of the data, 
as in panel A. -^3 
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A) SMART-seq2 — 2.5x reduction B) SMART-seq2 — 5x reduction 




Figure 2: Distributions of slopes, intercepts, and correlation coefficients for 
experiment 3. Nextera XT reactions were reduced in volume by the indicated 
amount. 




Figure 3: Estimating the source of preamplification noise. Plotted 
are the estimated slopes for each of the 3 samples. The 2.5 x and 5x dilution 
samples used the same preamplified cDNA, but different tagmentation reactions, 
whereas the Full Size sample used different preamplification and tagmentation 
reactions. 
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